Omiotis: A Thesaurus-Based Measure of Text Relatedness

Authors

  • George Tsatsaronis
  • Iraklis Varlamis
  • Michalis Vazirgiannis
  • Kjetil Nørvåg
Abstract

In this paper we present a new approach for measuring the relatedness between text segments, based on implicit semantic links between their words, as offered by a word thesaurus, namely WordNet. The approach requires no training, since it relies only on WordNet to derive the implicit semantic links between the words of a text. The paper presents a prototype online demo of the measure, which can provide word-to-word relatedness values, even for words of different parts of speech. In addition, the demo allows for the computation of relatedness between text segments.
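To make the idea of lifting word-to-word relatedness to text segments concrete, here is a minimal sketch. It is not the Omiotis measure itself, whose formula the abstract does not give. It assumes a tiny hand-made link graph (`EDGES`) standing in for WordNet, scores two words by inverse shortest-path length, and averages each word's best match in the other segment. All names (`EDGES`, `build_graph`, `word_relatedness`, `text_relatedness`) are hypothetical.

```python
from collections import deque

# Hypothetical toy thesaurus links (NOT real WordNet data). Note that the
# verb "drive" is linked to the noun "car", mimicking cross-part-of-speech links.
EDGES = [
    ("car", "vehicle"),
    ("truck", "vehicle"),
    ("car", "drive"),
    ("banana", "fruit"),
    ("apple", "fruit"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of word pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def path_length(graph, source, target):
    """Breadth-first search; returns the number of links, or None if unreachable."""
    if source == target:
        return 0
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def word_relatedness(graph, a, b):
    """Assumed scoring: 1 / (1 + path length); 0.0 when no path exists."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def text_relatedness(graph, text_a, text_b):
    """Symmetric average of each word's best match in the other segment."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    if not words_a or not words_b:
        return 0.0

    def directed(src, dst):
        best = [max(word_relatedness(graph, w, v) for v in dst) for w in src]
        return sum(best) / len(src)

    return 0.5 * (directed(words_a, words_b) + directed(words_b, words_a))


graph = build_graph(EDGES)
print(word_relatedness(graph, "car", "truck"))
print(text_relatedness(graph, "drive car", "truck"))
```

Like the approach described in the abstract, this sketch needs no training data; everything comes from the link structure. A real implementation would replace the toy graph with WordNet and use the weighting defined in the paper.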

Similar resources

Text Relatedness Based on a Word Thesaurus

The computation of relatedness between two fragments of text in an automated manner requires taking into account a wide range of factors pertaining to the meaning the two fragments convey, and the pairwise relations between their words. Without doubt, a measure of relatedness between text segments must take into account both the lexical and the semantic relatedness between words. Such a measure...

A Knowledge-Based Semantic Kernel for Text Classification

Typically, in textual document classification the documents are represented in the vector space using the “Bag of Words” (BOW) approach. Despite its ease of use, BOW representation cannot handle word synonymy and polysemy problems and does not consider semantic relatedness between words. In this paper, we overcome the shortages of the BOW approach by embedding a known WordNet-based semantic re...

Random Walk on WordNet to Measure Lexical Semantic Relatedness

The need to determine semantic relatedness or its inverse, semantic distance, between two lexically expressed concepts is a problem that pervades much of natural language processing such as document summarization, information extraction and retrieval, word sense disambiguation and the automatic correction of word errors in text. Standard ways of measuring similarity between two words on a thesa...

Semantic smoothing for text clustering

In this paper we present a new semantic smoothing vector space kernel (S-VSM) for text documents clustering. In the suggested approach semantic relatedness between words is used to smooth the similarity and the representation of text documents. The basic hypothesis examined is that considering semantic relatedness between two text documents may improve the performance of the text document clust...

Fast Semantic Relatedness: WordNet::Similarity vs Roget's Thesaurus

A Measure of Semantic Relatedness (MSR) automatically determines how close two words are in meaning. MSRs are used in such Natural Language Processing (NLP) problems as word-sense disambiguation or text summarization. To solve such problems may require millions of relatedness scores, but MSR run-time, clearly a major concern, has rarely been considered in NLP research. To evaluate an MSR, one o...

Publication date: 2009